Lisp Machine Manual



10. Characters and Strings 






A string is a one-dimensional array representing a sequence of characters. The printed
representation of a string is its characters enclosed in quotation marks, for example "foo bar".
Strings are constants, that is, evaluating a string returns that string. Strings are the right data
type to use for text-processing.

Individual characters can be represented by character objects or by fixnums. A character
object is actually the same as a fixnum except that it has a recognizably different data type and
prints differently. Without escaping, a character object is printed by outputting the character it
represents. With escaping, a character object prints as #\char in Common Lisp syntax or as
#*/char in traditional syntax; see section 10.1.1, page 205 and page 522. By contrast, a fixnum
would in all cases print as a sequence of digits. Character objects are accepted by most numeric
functions in place of fixnums, and may be used as array indices. When evaluated they are
constants.

The character object data type was introduced recently for Common Lisp support;
traditionally characters were always represented as fixnums, and nearly all system and user code
still does so. Character objects are interchangeable with fixnums in most contexts, but not in eq,
which is often used to compare the result of the stream input operations such as :tyi, since that
might be nil. Therefore, the stream input operations still return fixnums that represent characters.
Aside from this, Common Lisp functions that return a character return a character object, while
traditional functions return a fixnum. The fixnum which is the character code representing char
can be written as #/char in traditional syntax. This is equivalent to writing the fixnum using
digits, but does not require you to know the character code.

Most strings are arrays of type art-string, where each element is stored in eight bits. Only
characters with character code less than 256 can be stored in an ordinary string; these characters
form the type string-char. A string can also be an array of type art-fat-string, where each
element holds a sixteen-bit unsigned fixnum. The extra bits allow for multiple fonts or an
expanded character set.

Since strings are arrays, the usual array-referencing function aref is used to extract characters 
from strings. For example, (aref "frob" 1) returns the representation of lower case r. The first
character is at index zero. 

Conceptually, the elements of a string are character objects. This is what Common Lisp 
programs expect to see when they do aref (or char, which on the Lisp Machine is synonymous
with aref) on a string. But nearly all Lisp Machine programs are traditional, and expect elements
of strings to be fixnums. Therefore, aref of a string actually returns a fixnum. A distinct version
of aref exists for Common Lisp programs. It is cli:aref and it does return character objects if
given a string. For all other kinds of arrays, aref and cli:aref are equivalent.

(aref "Foo" 1) => #o157

(cli:aref "Foo" 1) => #*/o
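For comparison, here is a minimal sketch in portable Common Lisp, where both char and aref on a string return character objects (the behavior the text attributes to cli:aref). The helper name second-char is hypothetical, and the numeric code shown assumes an ASCII-compatible implementation.

```lisp
;; Portable Common Lisp sketch; not specific to the Lisp Machine.
;; SECOND-CHAR is a hypothetical helper used only for illustration.
(defun second-char (string)
  "Return the character at index 1 of STRING and its character code."
  (let ((ch (char string 1)))          ; indices are zero-based
    (values ch (char-code ch))))

;; (second-char "Foo") returns #\o and, on ASCII-compatible
;; implementations, 111 (octal 157).
```

The second value corresponds to the fixnum that traditional aref is described as returning.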

It is also legal to store into strings, for example using setf of aref. As with rplaca on lists,
this changes the actual object; you must be careful to understand where side-effects will
propagate. It makes no difference whether a character object or a fixnum is stored. When you
are making strings that you intend to change later, you probably want to create an array with a
fill-pointer (see page 166) so that you can change the length of the string as well as the contents.
The length of a string is always computed using array-active-length, so that if a string has a
fill-pointer, its value is used as the length.

PS:<L.MAN>FD-STR.TEXT.27 8-JUN-84
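The use of a fill pointer for strings that change length can be sketched in portable Common Lisp as follows. This is an illustration under stated assumptions, not Lisp Machine code: the helper names are hypothetical, and the standard length function plays the role that the text assigns to array-active-length.

```lisp
;; Portable Common Lisp sketch of a string with a fill pointer.
;; MAKE-GROWABLE-STRING and DEMO-FILL-POINTER are hypothetical names.
(defun make-growable-string (&optional (capacity 8))
  "Return an empty, adjustable string with a fill pointer."
  (make-array capacity
              :element-type 'character
              :fill-pointer 0
              :adjustable t))

(defun demo-fill-pointer ()
  (let ((s (make-growable-string 4)))
    (loop for ch across "hello"          ; five characters, capacity four
          do (vector-push-extend ch s))  ; grows the array when it is full
    (setf (char s 0) #\J)                ; store into the string in place
    ;; LENGTH honors the fill pointer, so it reports the active length.
    (values (copy-seq s) (length s))))
```

Calling (demo-fill-pointer) returns the string "Jello" and the length 5.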

The functions described in this section provide a variety of useful operations on strings. In
place of a string, most of these functions accept a symbol or a fixnum as an argument, coercing
it into a string. Given a symbol, its print name, which is a string, is used. Given a fixnum, a
one-character string containing the character designated by that fixnum is used. Several of the
functions actually work on any type of one-dimensional array and may be useful for other than
string processing; these are the functions such as substring and string-length which do not
depend on the elements of the string being characters.

The generic sequence functions in chapter 9 may also be used on strings. 

10.1 Characters 

The Lisp Machine data type for character objects is a recent addition to the system. Most 
programs still use fixnums to represent characters. 

Common Lisp programs typically work with actual character objects but programs traditionally 
use fixnums to represent characters. The new Common Lisp functions for operating with 
characters have been implemented to accept fixnums as well, so that they can be used equally 
well from traditional programs. 

characterp object 

t if object is a character object; nil otherwise. In particular, it is nil if object is a fixnum 
such as traditional programs use to represent characters. 

character object 

Coerces object to a single character, represented as a fixnum. If object is a number, it 
is returned. If object is a string or an array, its first element is returned. If object is a 
symbol, the first character of its pname is returned. Otherwise an error occurs. The way
characters are represented as fixnums is explained in section 10.1.1, page 205. 

cli:character object

Coerces object into a character and returns the character as a character object for Common
Lisp programs.

int-char fixnum

Converts fixnum, regarded as representing a character, to a character object. This is a
special case of cli:character. (int-char #o101) is the character object for A. If a
character object is given as an argument, it is returned unchanged.

char-int char

Converts char, a character object, to the fixnum which represents the same character.
This is the inverse of int-char. It may also be given a fixnum as argument, in which
case the value is the same fixnum.

10.1.1 Components of a Character 

A character object, or a fixnum which is interpreted as a character, contains three separate
pieces of information: the character code, the font number, and the modifier bits. Each of these
things is an integer from a fixed range. The character code ranges from 0 to 377 (octal), the font
number from 0 to 377 (octal), and the modifier bits from 0 to 17 (octal). These numeric
constants should not appear in programs; instead, use the constant char-code-limit and the
related constants described below.

Ordinary strings can hold only characters whose font number and modifier bits are zero. Fat
strings can hold characters with any font number, but the modifier bits must still be zero.

Character codes less than 200 octal are printing graphics; when output to a device they are
assumed to print a character and move the cursor one character position to the right. (All
software provides for variable-width fonts, so the term "character position" shouldn't be taken too
literally.)

Character codes 200 through 236 octal are used for special characters. Character 200 is a
"null character", which does not correspond to any key on the keyboard. The null character is
not used for anything much; fasload uses it internally. Characters 201 through 236 correspond to
the special function keys on the keyboard such as Return and Call. The remaining character
codes 237 through 377 octal are reserved for future expansion.

Most of the special characters do not normally appear in files (although it is not forbidden for 
files to contain them). These characters exist mainly to be used as "commands" from the 
keyboard. A few special characters, however, are "format effectors" which are just as legitimate
as printing characters in text files. The names and meanings of these characters are:

Return  The "newline" character, which separates lines of text. We do not use the PDP-10
        convention which separates lines by a pair of characters, a "carriage return"
        and a "linefeed".

Page    The "page separator" character, which separates pages of text.

Tab     The "tabulation" character, which spaces to the right until the next "tab stop".
        Tab stops are normally every 8 character positions.

The space character is considered to be a printing character whose printed image happens to 
be blank, rather than a format effector. 

When a letter is typed with any of the modifier bit keys (Control, Meta, Super or Hyper),
the letter is normally upper-case. If the Shift key is pressed as well, then the letter becomes
lower-case. This is exactly the reverse of what the Shift key does to letters without control bits.
(The Shift-lock key has no effect on letters with control bits.)






char-code char

char-font char

char-bits char

Return the character code of char, the font number of char, and the modifier bits value
of char. char may be a fixnum or a character object; the value is always a fixnum.

These used to be written as

(ldb %%ch-char char)

(ldb %%ch-font char)

(ldb %%ch-control-meta char)
Such use of ldb is frequent but obsolete.

char-code-limit Constant

A constant whose value is a bound on the maximum code of any character. In the Lisp
Machine, currently, it is 400 (octal).

char-font-limit Constant

A constant whose value is a bound on the maximum font number value of any character.
In the Lisp Machine, currently, it is 400 (octal).

char-bits-limit Constant

A constant whose value is a bound on the maximum modifier bits value of any character.
In the Lisp Machine, currently, it is 20 (octal). Thus, there are four modifier bits. These
are just the familiar Control, Meta, Super and Hyper bits.

char-control-bit Constant

char-meta-bit Constant

char-super-bit Constant

char-hyper-bit Constant

Constants with values 1, 2, 4 and 8. These give the meanings of the bits within the bits
field of a character object. Thus, (bit-test char-meta-bit (char-bits char)) would be
non-nil if char is a meta-character. (This can also be tested with char-bit.)

char-bit char name

t if char has the modifier bit named by name. name is one of the following four
symbols: :control, :meta, :super, and :hyper.
(char-bit #\meta-x :meta) => t.

set-char-bit char name newvalue

Returns a character like char except that the bit specified by name is present if newvalue 
is non-nil, absent otherwise. Thus, 

(set-char-bit #\x :meta t) => #\meta-x. 
The value is a fixnum if char is one; a character object if char is one. 
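The way named modifier bits map onto a bits value can be sketched with ordinary integer operations. The following is a portable Common Lisp illustration that works on a plain integer rather than on a character; only the mask values 1, 2, 4 and 8 come from the text, and all of the names are hypothetical.

```lisp
;; Sketch on plain integers. Mask values 1, 2, 4 and 8 are from the text;
;; the names are hypothetical and chosen to avoid built-in definitions.
(defconstant +control-bit+ 1)
(defconstant +meta-bit+    2)
(defconstant +super-bit+   4)
(defconstant +hyper-bit+   8)

(defun bit-mask (name)
  "Map a modifier keyword to its mask."
  (ecase name
    (:control +control-bit+)
    (:meta    +meta-bit+)
    (:super   +super-bit+)
    (:hyper   +hyper-bit+)))

(defun bits-has-p (bits name)
  "True if the modifier-bits value BITS includes the bit called NAME."
  (logtest (bit-mask name) bits))

(defun bits-with (bits name present-p)
  "Return BITS with the bit called NAME set if PRESENT-P, cleared otherwise."
  (if present-p
      (logior bits (bit-mask name))
      (logandc2 bits (bit-mask name))))
```

For example, (bits-with 0 :meta t) returns 2, and (bits-has-p 2 :meta) is true.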

Until recently the only way to access the character code, font and modifier bits was with ldb,
using the byte field names listed below. Most code still uses that method, but it is obsolete;
char-bit should be used instead.

%%kbd-char
%%ch-char           Specifies the byte containing the character code.
%%ch-font           Specifies the byte containing the font number.
%%kbd-control       Specifies the byte containing the Control bit.
%%kbd-meta          Specifies the byte containing the Meta bit.
%%kbd-super         Specifies the byte containing the Super bit.
%%kbd-hyper         Specifies the byte containing the Hyper bit.
%%kbd-control-meta  Specifies the byte containing all the modifier bits.
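The older ldb style of access can be illustrated with standard byte specifiers. In this portable Common Lisp sketch the field widths (8 bits of code, 8 bits of font, 4 modifier bits) follow from the ranges given in section 10.1.1, but the field positions are an assumption made for the example, and all of the names are hypothetical.

```lisp
;; ASSUMPTION: the field positions below are hypothetical. Only the widths
;; (8-bit code, 8-bit font, 4 modifier bits) follow from the text.
(defparameter *code-field* (byte 8 0))   ; character code
(defparameter *font-field* (byte 8 8))   ; font number
(defparameter *bits-field* (byte 4 16))  ; modifier bits

(defun pack-char (code font bits)
  "Combine CODE, FONT and BITS into one integer."
  (dpb bits *bits-field*
       (dpb font *font-field*
            (dpb code *code-field* 0))))

(defun unpack-char (fixnum)
  "Return the code, font and bits fields of FIXNUM as three values."
  (values (ldb *code-field* fixnum)
          (ldb *font-field* fixnum)
          (ldb *bits-field* fixnum)))
```

With these assumed positions, (unpack-char (pack-char #o101 1 2)) returns 65, 1 and 2.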

Characters are sometimes used to represent mouse clicks. The character says which button was
pressed and how many times. Refer to the Window System manual for an explanation of how
these characters are generated.

tv:kbd-mouse-p char 

t if char is a character used to represent a mouse click. Such characters are always 
distinguishable from characters that represent keyboard input. 

%%kbd-mouse-button Constant

The value of %%kbd-mouse-button is a byte specifier for the field in a mouse signal
that says which button was clicked. The byte contains 0, 1, or 2 for the left, middle or
right button, respectively.

%%kbd-mouse-n-clicks Constant

The value of %%kbd-mouse-n-clicks is a byte specifier for the field in a mouse signal
that says how many times the button was clicked. The byte contains one less than the
number of times the button was clicked.

10.1.2 Constructing Character Objects 

code-char code &optional (bits 0) (font 0)

make-char code &optional (bits 0) (font 0)

Returns a character object made from code, bits and font. Common Lisp says that not
all combinations may be valid, and that nil is returned for an invalid combination. On
the Lisp Machine, any combination is valid if the arguments are valid individually.

According to Common Lisp, code-char requires a number as a first argument, whereas
make-char requires a character object, whose character code is used. On the Lisp
Machine, either function may be used in either way.

digit-char weight &optional (radix 10.) (font 0)

Returns a character object which is the digit with the specified weight, and with font as
specified. However, if there is no suitable character which has weight weight in the
specified radix, the value is nil. If the "digit" is a letter (which happens if weight is
greater than 9), it is returned in upper case.



tv:make-mouse-char button n-clicks

Returns the fixnum character code that represents a mouse click in the standard way.
tv:mouse-char-p of this value is t. button is 0 for the left button, 1 for the middle
button, or 2 for the right button. n-clicks is one less than the number of clicks (1 for a
double click, normally).

10.1.3 The Character Set 

Here are the numerical values of the characters in the Zetalisp character set. It should never
be necessary for a user or a source program to know these values. Indeed, they are likely to be
changed in the future. There are symbolic names for all characters; see the section on character
names, below.

It is worth pointing out that the Zetalisp character set is different from the ASCII character
set. File servers operating on hosts that use ASCII for storing text files automatically perform
character set conversion when text files are read or written. The details of the mapping are
explained in section 25.8, page 607.



000 center-dot (·)              040 space   100 @   140 `
001 down arrow (↓)              041 !       101 A   141 a
002 alpha (α)                   042 "       102 B   142 b
003 beta (β)                    043 #       103 C   143 c
004 and-sign (∧)                044 $       104 D   144 d
005 not-sign (¬)                045 %       105 E   145 e
006 epsilon (ε)                 046 &       106 F   146 f
007 pi (π)                      047 '       107 G   147 g
010 lambda (λ)                  050 (       110 H   150 h
011 gamma (γ)                   051 )       111 I   151 i
012 delta (δ)                   052 *       112 J   152 j
013 uparrow (↑)                 053 +       113 K   153 k
014 plus-minus (±)              054 ,       114 L   154 l
015 circle-plus (⊕)             055 -       115 M   155 m
016 infinity (∞)                056 .       116 N   156 n
017 partial delta (∂)           057 /       117 O   157 o
020 left horseshoe (⊂)          060 0       120 P   160 p
021 right horseshoe (⊃)         061 1       121 Q   161 q
022 up horseshoe (∩)            062 2       122 R   162 r
023 down horseshoe (∪)          063 3       123 S   163 s
024 universal quantifier (∀)    064 4       124 T   164 t
025 existential quantifier (∃)  065 5       125 U   165 u
026 circle-X (⊗)                066 6       126 V   166 v
027 double-arrow (↔)            067 7       127 W   167 w
030 left arrow (←)              070 8       130 X   170 x
031 right arrow (→)             071 9       131 Y   171 y
032 not-equals (≠)              072 :       132 Z   172 z
033 diamond (altmode) (◊)       073 ;       133 [   173 {
034 less-or-equal (≤)           074 <       134 \   174 |
035 greater-or-equal (≥)        075 =       135 ]   175 }
036 equivalence (≡)             076 >       136 ^   176 ~
037 or (∨)                      077 ?       137 _   177 ∫

200 Null character    210 Overstrike    220 Stop-output   230 Roman-iv
201 Break             211 Tab           221 Abort         231 Hand-up
202 Clear             212 Line          222 Resume        232 Hand-down
203 Call              213 Delete        223 Status        233 Hand-left
204 Terminal escape   214 Page          224 End           234 Hand-right
205 Macro/backnext    215 Return        225 Roman-i       235 System
206 Help              216 Quote         226 Roman-ii      236 Network
207 Rubout            217 Hold-output   227 Roman-iii
237-377 reserved for the future

The Lisp Machine Character Set
(all numbers in octal)



10.1.4 Classifying Characters 

string-char-p char

t if char is a character that can be stored in a string. On the Lisp Machine, this is true
if the font and modifier bits of char are zero.

standard-char-p char

t if char is a standard Common Lisp character: any of the 95 ASCII printing characters
(including Space), and the Return character. Thus (standard-char-p #\end) is nil.

graphic-char-p char

t if char is a graphic character: one which has a printed shape. A, -, Space and r are
all graphic characters; Return, End and Abort are not. A character whose modifier bits
are nonzero is never graphic.

Ordinary output to windows prints graphic characters using the current font. Nongraphic
characters are printed using lozenges unless they have special formatting meanings (as
Return does).

alpha-char-p char 

t if char is a letter with zero modifier bits. 

digit-char-p char &optional (radix 10.)

If char is a digit available in the specified radix, returns the weight of that digit.
Otherwise, it returns nil. If the modifier bits of char are nonzero, the value is always nil.
(It would be more useful to ignore the modifier bits, but this decision provides Common
Lisp with a foolish consistency.) Examples:

(digit-char-p #\8 8) => nil

(digit-char-p #\8 9) => 8

(digit-char-p #\F 16.) => 15.

(digit-char-p #\c-8 anything) => nil

alphanumericp char

t if char is a letter or a digit 0 through 9, with zero modifier bits.
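Because digit-char-p returns the weight of a digit rather than just a truth value, it can drive a simple number parser. The following portable Common Lisp sketch shows one way to do that; parse-digits is a hypothetical helper, not a function described in this manual.

```lisp
;; Portable Common Lisp sketch. PARSE-DIGITS is a hypothetical helper.
(defun parse-digits (string &optional (radix 10))
  "Return the integer denoted by STRING in RADIX, or NIL if STRING is
empty or contains a character that is not a digit in that radix."
  (let ((value 0))
    (loop for ch across string
          for weight = (digit-char-p ch radix)   ; weight of the digit, or NIL
          do (if weight
                 (setf value (+ (* value radix) weight))
                 (return-from parse-digits nil)))
    (and (plusp (length string)) value)))
```

For example, (parse-digits "777" 8) returns 511, while (parse-digits "78" 8) returns nil because 8 is not an octal digit.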

10.1.5 Comparing Characters 

char-equal &rest chars

This is the primitive for comparing characters for equality; many of the string functions
call it. The arguments may be fixnums or character objects indiscriminately. The result is
t if the characters are equal ignoring case, font and modifier bits, otherwise nil.

char-not-equal &rest chars

t if the arguments are all different as characters, ignoring case, font and modifier bits.

char-lessp &rest chars

char-greaterp &rest chars

char-not-lessp &rest chars

char-not-greaterp &rest chars

Ordered comparison of characters, ignoring case, font and modifier bits. These are the
primitives for comparing characters for order; many of the string functions call them. The
arguments may be fixnums or character objects. The result is t if the arguments are in
strictly increasing (strictly decreasing, nonincreasing, nondecreasing) order. Details of the
ordering of characters are in section 10.1.1, page 205.

char= char1 &rest chars

char//= char1 &rest chars

char> char1 &rest chars

char< char1 &rest chars

char>= char1 &rest chars

char<= char1 &rest chars

These are the Common Lisp functions for comparing characters and including the case,
font and bits in the comparison. On the Lisp Machine they are synonyms for the
numeric comparison functions =, >, etc. Note that in Common Lisp syntax you would
write char/=, not char//=.
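The difference between the case-insensitive and case-sensitive comparison functions can be seen by using each as a test function. This portable Common Lisp sketch counts occurrences of a character either way; count-char is a hypothetical helper.

```lisp
;; Portable Common Lisp sketch. COUNT-CHAR is a hypothetical helper.
(defun count-char (char string &key (ignore-case t))
  "Count occurrences of CHAR in STRING.
With IGNORE-CASE true, CHAR-EQUAL is the test; otherwise CHAR=."
  (count char string
         :test (if ignore-case #'char-equal #'char=)))
```

(count-char #\a "Banana Ant") returns 4, whereas (count-char #\a "Banana Ant" :ignore-case nil) returns 3, since the upper case A no longer matches.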

10.1.6 Character Names 

Characters can sometimes be referred to by long names; as, for example, in the #\
construct in Lisp programs. Every basic character (zero modifier bits) which is not a graphic
character has one or more standard names. Some graphic characters have standard names too.
When a non-graphic character is output to a window, it appears as a lozenge containing the
character's standard name.

char-name char

Returns the standard name (or one of the standard names) of char, or nil if there is
none. The name is returned as a string. (char-name #\space) is the string "SPACE".

If char has nonzero modifier bits, the value is nil. Compound names such as Control-X
are not constructed by this function.

name-char name

Returns (as a character object) the character for which name is a name, or returns nil if
name is not a recognized character name. name may be a symbol or a string. Compound
names such as Control-X are not recognized.

read uses this function to process the #\ construct when a character name is
encountered.

The following are the recognized special character names, in alphabetical order except with
synonyms together. Character names are encoded and decoded by the functions char-name and
name-char (page 211).



First a list of the special function keys.

abort                      break                     call               clear-input, clear
delete                     end                       hand-down          hand-left
hand-right                 hand-up                   help               hold-output
line, lf                   macro, back-next          network
overstrike, backspace, bs  page, form, clear-screen
quote                      resume                    return, cr
roman-i                    roman-ii                  roman-iii          roman-iv
rubout                     space, sp                 status             stop-output
system                     tab                       terminal, esc

These are printing characters that also have special names because they may be hard to type
on the hosts that are used as file servers.

altmode               circle-plus             delta          gamma
integral              lambda                  plus-minus     uparrow
center-dot            down-arrow              alpha          beta
and-sign              not-sign                epsilon        pi
lambda                gamma                   delta          up-arrow
plus-minus            circle-plus             infinity       partial-delta
left-horseshoe        right-horseshoe         up-horseshoe   down-horseshoe
universal-quantifier  existential-quantifier
circle-x              double-arrow            left-arrow     right-arrow
not-equal             altmode                 less-or-equal  greater-or-equal
equivalence           or-sign

The following names are for special characters sometimes used to represent single and double
mouse clicks. The buttons can be called either l, m, r or 1, 2, 3 depending on stylistic
preference.

mouse-l-1 or mouse-1-1    mouse-l-2 or mouse-1-2
mouse-m-1 or mouse-2-1    mouse-m-2 or mouse-2-2
mouse-r-1 or mouse-3-1    mouse-r-2 or mouse-3-2

10.2 Conversion to Upper or Lower Case 

upper-case-p char 

t if char is an upper case letter with zero modifier bits. 

lower-case-p char

t if char is a lower case letter with zero modifier bits.

both-case-p char

This Common Lisp function is defined to return t if char is a character which has distinct
upper and lower case forms. On the Lisp Machine it returns t if char is a letter with
zero modifier bits.

char-upcase char

If char is a lower-case alphabetic character its upper-case form is returned; otherwise,
char itself is returned. If font information or modifier bits are present, they are preserved.
If char is a fixnum, the value is a fixnum. If char is a character object, the value is a
character object.

char-downcase char 

Similar, but converts to lower case. 

string-upcase string &key (start 0) end

Returns a string like string, with all lower-case alphabetic characters replaced by the
corresponding upper-case characters. If start or end is specified, only the specified portion
of the string is converted, but in any case the entire string is returned.

The result is a copy of string unless no change is necessary. string itself is never
modified.

string-downcase string &key (start 0) end

Similar, but converts to lower case.

string-capitalize string &key (start 0) end

Returns a string like string in which all, or the specified portion, has been processed by
capitalizing each word. For this function, a word is any maximal sequence of letters or
digits. It is capitalized by putting the first character (if it is a letter) in upper case and
any letters in the rest of the word in lower case.

The result is a copy of string unless no change is necessary. string itself is never
modified.

nstring-upcase string &key (start 0) end
nstring-downcase string &key (start 0) end
nstring-capitalize string &key (start 0) end

Like the previous functions except that they modify string itself and return it.

string-capitalize-words string &optional (copy-p t) (spaces t)

Puts each word in string into lower-case with an upper case initial, and if spaces is non-nil
replaces each hyphen character with a space.

If copy-p is t, the value is a copy of string, and string itself is unchanged. Otherwise,
string itself is returned, with its contents changed.

This function is somewhat obsolete. One can use string-capitalize followed optionally by
string-subst-char.

See also the format operation ~(...~) on page 488.
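The distinction between the copying and the destructive case-conversion functions can be sketched in portable Common Lisp, which provides functions of the same names with the same keyword arguments. The function case-demo is a hypothetical wrapper used only to collect the results.

```lisp
;; Portable Common Lisp sketch. CASE-DEMO is a hypothetical wrapper.
(defun case-demo ()
  (let ((buffer (copy-seq "hello world")))   ; fresh copy, safe to modify
    (nstring-upcase buffer :end 5)           ; destructive: changes BUFFER itself
    (list (string-upcase "hello world" :start 6)  ; only the portion from index 6
          (string-downcase "Hello World" :end 5)  ; only the first five characters
          (string-capitalize "hello WORLD")       ; each word capitalized
          buffer)))
```

(case-demo) returns the list ("hello WORLD" "hello World" "Hello World" "HELLO world").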






10.3 Basic String Operations 

make-string size &key (initial-element 0)

Creates and returns a string of length size, with each element initialized to initial-element,
which may be a fixnum or a character.

string x

Coerces x into a string. Most of the string functions apply this to their string arguments.
If x is a string (or any array), it is returned. If x is a symbol, its pname is returned. If
x is a non-negative fixnum less than 400 octal, a one-character-long string containing it is
created and returned. If x is an instance that supports the :string-for-printing operation
(such as a pathname) then the result of that operation is returned. Otherwise, an error is
signaled.

If you want to get the printed representation of an object into the form of a string, this
function is not what you should use. You can use format, passing a first argument of nil
(see page 483). You might also want to use with-output-to-string (see page 474).

string-length string

Returns the number of characters in string. This is 1 if string is a number or character
object, the array-active-length (see page 174) if string is an array, or the array-active-length
of the pname if string is a symbol.

string-equal string1 string2 &key (start1 0) (start2 0) end1 end2

Compares two strings, returning t if they are equal and nil if they are not. The
comparison ignores the font and case of the characters. equal calls string-equal if
applied to two strings.

The keyword arguments start1 and start2 are the starting indices into the strings. end1
and end2 are the final indices; the comparison stops just before the final index. nil for
end1 or end2 means stop at the end of the string.

Examples:

(string-equal "Foo" "foo") => t
(string-equal "foo" "bar") => nil
(string-equal "element" "select" 0 1 3 4) => t

An older calling sequence in which the start and end arguments are positional rather than
keyword is still supported. The arguments come in the order start1 start2 end1 end2.
This calling sequence is obsolete and should be changed whenever found.

string-not-equal string1 string2 &key (start1 0) end1 (start2 0) end2

(not (string-equal ...))

string= string1 string2 &key (start1 0) (start2 0) end1 end2

Is like string-equal except that case is significant.

(string= "A" "a") => nil



string≠ string1 string2 &key (start1 0) end1 (start2 0) end2
string//= string1 string2 &key (start1 0) end1 (start2 0) end2

(not (string= ...)). Note that in Common Lisp syntax you would write string/=, not
string//=.

string-lessp string1 string2 &key (start1 0) end1 (start2 0) end2
string-greaterp string1 string2 &key (start1 0) end1 (start2 0) end2
string-not-greaterp string1 string2 &key (start1 0) end1 (start2 0) end2
string-not-lessp string1 string2 &key (start1 0) end1 (start2 0) end2

Compare all or the specified portions of string1 and string2 using dictionary order.
Characters are compared using char-lessp and char-equal so that font and alphabetic
case are ignored.

You can use these functions as predicates, but they do more. If the strings fit the
condition (e.g. string1 is strictly less in string-lessp) then the value is a number, the
index in string1 of the first point of difference between the strings. This equals the length
of string1 if the strings match. If the condition is not met, the value is nil.

(string-lessp "aa" "Ab") => 1
(string-lessp "aa" "Ab" :end1 1 :end2 1) => nil
(string-not-greaterp "Aa" "Ab" :end1 1 :end2 1) => 1
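Since the ordering predicates return an index rather than just t, their value can be used directly. The sketch below, in portable Common Lisp (whose string-lessp and string-greaterp also return the index of the first difference), finds where two strings first differ; first-difference is a hypothetical helper.

```lisp
;; Portable Common Lisp sketch. FIRST-DIFFERENCE is a hypothetical helper.
(defun first-difference (string1 string2)
  "Return the index in STRING1 where the strings first differ, ignoring
case, or NIL if they are equal ignoring case."
  ;; An index of 0 is still non-NIL in Lisp, so OR works here.
  (or (string-lessp string1 string2)
      (string-greaterp string1 string2)))
```

(first-difference "aa" "Ab") returns 1, (first-difference "ab" "abc") returns 2, and (first-difference "Foo" "foo") returns nil.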

string< string1 string2 &key (start1 0) end1 (start2 0) end2
string> string1 string2 &key (start1 0) end1 (start2 0) end2
string>= string1 string2 &key (start1 0) end1 (start2 0) end2
string<= string1 string2 &key (start1 0) end1 (start2 0) end2
string≤ string1 string2 &key (start1 0) end1 (start2 0) end2
string≥ string1 string2 &key (start1 0) end1 (start2 0) end2

Like string-lessp, etc., but treat case and font as significant when comparing characters.

(string< "AA" "aa") => 0

(string-lessp "AA" "aa") => nil

string-compare string1 string2 &optional (start1 0) (start2 0) end1 end2

Compares two strings using dictionary order (as defined by char-lessp). The arguments
are interpreted as in string-equal. The result is 0 if the strings are equal, a negative
number if string1 is less than string2, or a positive number if string1 is greater than
string2. If the strings are not equal, the absolute value of the number returned is one
greater than the index (in string1) where the first difference occurred.
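The result convention of string-compare can be sketched in terms of the ordering predicates. The following portable Common Lisp function is an illustration of the convention described above, not the Lisp Machine implementation; it omits the optional start and end arguments, and its name is hypothetical.

```lisp
;; Sketch of the result convention only; not the system's definition.
;; SKETCH-STRING-COMPARE is a hypothetical name.
(defun sketch-string-compare (string1 string2)
  "Return 0 if the strings are equal ignoring case, otherwise a number
whose absolute value is one more than the index of the first difference:
negative if STRING1 is less, positive if it is greater."
  (let ((less (string-lessp string1 string2)))
    (if less
        (- (1+ less))
        (let ((greater (string-greaterp string1 string2)))
          (if greater
              (1+ greater)
              0)))))
```

(sketch-string-compare "abc" "abd") returns -3, (sketch-string-compare "abd" "abc") returns 3, and (sketch-string-compare "Foo" "foo") returns 0.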

substring string start &optional end area 

Extracts a substring of string, starting at the character specified by start and going up to 
but not including the character specified by end. start and end are 0-origin indices. The
length of the returned string is end minus start. If end is not specified it defaults to the
length of string. The area in which the result is to be consed may be optionally specified.
Example:

(substring "Nebuchadnezzar" 4 8) => "chad" 



nsubstring string start &optional end area

Is like substring except that the substring is not copied; instead an indirect array (see
page 167) is created which shares part of the argument string. Modifying one string will
modify the other.

Note that nsubstring does not necessarily use less storage than substring: an nsubstring
of any length uses at least as much storage as a substring 12 characters long. So you
shouldn't use this for efficiency; it is intended for uses in which it is important to have a
substring which, if modified, will cause the original string to be modified too.
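The sharing behavior of an indirect array can be sketched in portable Common Lisp using a displaced array. This is an analogy under stated assumptions, not the Lisp Machine's nsubstring; the names shared-substring and sharing-demo are hypothetical.

```lisp
;; Portable Common Lisp sketch using a displaced array.
;; SHARED-SUBSTRING and SHARING-DEMO are hypothetical names.
(defun shared-substring (string start end)
  "Return a string that shares storage with STRING from START to END."
  (make-array (- end start)
              :element-type (array-element-type string)
              :displaced-to string
              :displaced-index-offset start))

(defun sharing-demo ()
  (let* ((whole (copy-seq "Nebuchadnezzar"))    ; fresh, modifiable string
         (part (shared-substring whole 4 8)))   ; initially reads "chad"
    (setf (char part 0) #\C)                    ; also changes WHOLE
    (values (copy-seq part) whole)))
```

(sharing-demo) returns "Chad" and "NebuChadnezzar", showing that a store into the shared substring is visible in the original string.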

string-append &rest strings

Copies and concatenates any number of strings into a single string. With a single
argument, string-append simply copies it. If there are no arguments, the value is an
empty string. In fact, vectors of any type may be used as arguments, and the value is a
vector capable of holding all the elements of all the arguments. Thus string-append can
be used to copy and concatenate any type of vector. If the first argument is not an array
(for example, if it is a character), the value is a string.

Example:

(string-append #\! "foo" #\!) => "!foo!"

string-nconc modified-string &rest strings

Is like string-append except that instead of making a new string containing the
concatenation of its arguments, string-nconc modifies its first argument. modified-string
must have a fill-pointer so that additional characters can be tacked onto it. Compare this
with array-push-extend (page 178). The value of string-nconc is modified-string or a
new, longer copy of it; in the latter case the original copy is forwarded to the new copy
(see adjust-array-size, page 176). Unlike nconc, string-nconc with more than two
arguments modifies only its first argument, not every argument but the last.

string-trim char-set string

Returns a substring of string, with all characters in char-set stripped off the beginning 
and end. char-set is a set of characters, which can be represented as a list of characters, 
a string of characters or a single character. 

(string-trim '(#\sp) " Dr. No ") => "Dr. No" 
(string-trim "ab" "abbafooabb") => "foo" 

string-left-trim char-set string

Returns a substring of string, with all characters in char-set stripped off the beginning.
char-set is a set of characters, which can be represented as a list of characters, a string of
characters or a single character.

string-right-trim char-set string

Returns a substring of string, with all characters in char-set stripped off the end. char-set 
is a set of characters, which can be represented as a list of characters, a string of 
characters or a single character. 
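
The three trimming functions can be compared side by side. The forms below use the standard Common Lisp functions of the same names, which take the same argument order; note that standard Common Lisp requires char-set to be a sequence, so the single-character form described above is not shown.

```lisp
;; Standard Common Lisp versions of the trimming functions.
(defvar *both*  (string-trim "ab" "abbafooabb"))       ; "foo"
(defvar *left*  (string-left-trim "ab" "abbafooabb"))  ; "fooabb"
(defvar *right* (string-right-trim "ab" "abbafooabb")) ; "abbafoo"
(defvar *spaces* (string-trim '(#\Space) "  Dr. No  ")) ; "Dr. No"
```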






string-remove-fonts string

Returns a copy of string with each character truncated to 8 bits; that is, changed to font
zero.

If string is an ordinary string of array type art-string, this does not change anything, but
it makes a difference if string is an art-fat-string.

string-reverse string 

string-nreverse string

Like reverse and nreverse, but on strings only (see page 190). There is no longer any 
reason to use these functions except that they coerce numbers and symbols into strings 
like the other string functions. 

string-pluralize string

Returns a string containing the plural of the word in the argument string. Any added
characters go in the same case as the last character of string.

Example:

(string-pluralize "event") => "events"
(string-pluralize "trufan") => "trufen"
(string-pluralize "Can") => "Cans"
(string-pluralize "key") => "keys"
(string-pluralize "TRY") => "TRIES"

For words with multiple plural forms depending on the meaning, string-pluralize cannot
always do the right thing.

string-select-a-or-an word

Returns "a" or "an" according to the string word; whichever one appears to be correct 
to use before word in English. 

string-append-a-or-an word

Returns the result of appending "a " or "an ", whichever is appropriate, to the front of
word.

%string-equal string1 start1 string2 start2 count

%string-equal is the microcode primitive used by string-equal. It returns t if the count
characters of string1 starting at start1 are char-equal to the count characters of string2
starting at start2, or nil if the characters are not equal or if count runs off the length of
either array.

Instead of a fixnum, count may also be nil. In this case, %string-equal compares the
substring from start1 to (string-length string1) against the substring from start2 to
(string-length string2). If the lengths of these substrings differ, then they are not equal
and nil is returned.

Note that string1 and string2 must really be strings; the usual coercion of symbols and
fixnums to strings is not performed. This function is documented because certain
programs which require high efficiency and are willing to pay the price of less generality
may want to use %string-equal in place of string-equal.






Examples:

To compare the two strings foo and bar:
(%string-equal foo 0 bar 0 nil)
To see if the string foo starts with the characters "bar":
(%string-equal foo 0 "bar" 0 3)
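
The count rules can be modelled in portable Common Lisp as a rough sketch. string-equal-sketch is a hypothetical name, and the standard string-equal with explicit bounds stands in for the microcode comparison; fonts are not modelled.

```lisp
;; Hypothetical model of the %string-equal argument conventions.
;; COUNT = nil means "compare from each start to the end of its string".
(defun string-equal-sketch (string1 start1 string2 start2 count)
  (let ((end1 (if count (+ start1 count) (length string1)))
        (end2 (if count (+ start2 count) (length string2))))
    (and (<= end1 (length string1))   ; count must not run off string1
         (<= end2 (length string2))   ; ... nor off string2
         (string-equal string1 string2
                       :start1 start1 :end1 end1
                       :start2 start2 :end2 end2))))

(string-equal-sketch "barfly" 0 "bar" 0 3)  ; true: "barfly" starts with "bar"
(string-equal-sketch "ba" 0 "bar" 0 3)      ; false: count runs off "ba"
(string-equal-sketch "foo" 0 "food" 0 nil)  ; false: lengths differ
```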

alphabetic-case-affects-string-comparison Variable

If this variable is t, the functions %string-equal and %string-search-char consider case (and
font) significant in comparing characters. Normally this variable is nil and those primitives
ignore differences of case.

This variable may be bound by user programs around calls to %string-equal and
%string-search-char, but do not set it globally, for that may cause system malfunctions.
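
The difference between binding and setting can be shown with a small portable Common Lisp sketch. The variable *case-matters* and both functions are hypothetical stand-ins for the system variable and the primitives that consult it.

```lisp
;; *CASE-MATTERS* is a hypothetical stand-in for
;; alphabetic-case-affects-string-comparison.
(defvar *case-matters* nil)

(defun chars-match-p (a b)
  (if *case-matters* (char= a b) (char-equal a b)))

(defun case-sensitive-match-p (a b)
  ;; LET rebinds the special variable only for the extent of this call;
  ;; the global value is restored automatically on exit.
  (let ((*case-matters* t))
    (chars-match-p a b)))

(chars-match-p #\a #\A)          ; true: case is ignored by default
(case-sensitive-match-p #\a #\A) ; false: case matters inside the binding
(chars-match-p #\a #\A)          ; true again: the global value is unchanged
```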

10.4 String Searching 

string-search-char char string &optional (from 0) to consider-case

Searches through string starting at the index from, which defaults to the beginning, and
returns the index of the first character that is char-equal to char, or nil if none is
found. If to is non-nil, it is used in place of (string-length string) to limit the extent of
the search.
Example:

(string-search-char #\a "banana") => 1
Case (and font) is significant in comparison of characters if consider-case is non-nil. In
other words, characters are compared using char= rather than char-equal.

(string-search-char #\a "BAnana" 0 nil t) => 3
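
In portable Common Lisp the same search can be sketched with the standard position function. search-char-sketch is a hypothetical name that copies the argument list above; fonts are not modelled.

```lisp
;; Hypothetical model of string-search-char built on POSITION.
(defun search-char-sketch (char string &optional (from 0) to consider-case)
  (position char string
            :start from
            :end to   ; nil means "to the end of the string"
            :test (if consider-case #'char= #'char-equal)))

(search-char-sketch #\a "banana")         ; 1
(search-char-sketch #\a "BAnana")         ; 1, case ignored
(search-char-sketch #\a "BAnana" 0 nil t) ; 3, case significant
```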

%string-search-char char string from to

%string-search-char is the microcode primitive called by string-search-char and other
functions. string must be an array and char, from, and to must be fixnums. The
arguments are all required. Case-sensitivity is controlled by the value of the variable
alphabetic-case-affects-string-comparison rather than by an argument. Except for
these differences, %string-search-char is the same as string-search-char. This
function is documented for the benefit of those who require the maximum possible
efficiency in string searching.

string-search-not-char char string &optional (from 0) to consider-case

Like string-search-char but searches string for a character different from char.
Example:

(string-search-not-char #\B "banana") => 1
(string-search-not-char #\B "banana" 0 nil t) => 0

string-search key string &optional (from 0) to (key-from 0) key-to consider-case

Searches for the string key in the string string. The search begins at from, which defaults
to the beginning of string. The value returned is the index of the first character of the
first instance of key, or nil if none is found. If to is non-nil, it is used in place of
(string-length string) to limit the extent of the search.






The arguments key-from and key-to can be used to specify the portion of key to be
searched for, rather than all of key.

Case and font are significant in character comparison if consider-case is non-nil.

Example:

(string-search "an" "banana") => 1
(string-search "an" "banana" 2) => 3
(string-search "tank" "banana" 2 nil 1 3) => 3
(string-search "an" "BAnaNA" 0 nil 0 nil t) => nil
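
The way the four bounds interact can be sketched in portable Common Lisp with the standard search function. string-search-sketch is a hypothetical name that mirrors the argument list documented for string-search; it is not the system's implementation.

```lisp
;; Hypothetical model of string-search built on SEARCH.
;; KEY-FROM/KEY-TO select part of KEY; FROM/TO bound the region of STRING.
(defun string-search-sketch (key string
                             &optional (from 0) to (key-from 0) key-to
                                       consider-case)
  (search key string
          :start1 key-from :end1 key-to
          :start2 from :end2 to
          :test (if consider-case #'char= #'char-equal)))

(string-search-sketch "an" "banana")                ; 1
(string-search-sketch "an" "banana" 2)              ; 3
(string-search-sketch "tank" "banana" 2 nil 1 3)    ; 3, only "an" is sought
(string-search-sketch "an" "BAnaNA" 0 nil 0 nil t)  ; nil, case significant
```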

string-search-set char-set string &optional (from 0) to consider-case

Searches through string looking for a character that is in char-set. char-set is a set of
characters, which can be represented as a sequence of characters or a single character.

The search begins at the index from, which defaults to the beginning. It returns the
index of the first character that is char-equal to some element of char-set, or nil if none
is found. If to is non-nil, it is used in place of (string-length string) to limit the extent
of the search.

Case and font are significant in character comparison if consider-case is non-nil.
Example:

(string-search-set '(#\n #\o) "banana") => 2
(string-search-set "no" "banana") => 2
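
A minimal portable Common Lisp sketch of the set search follows. search-set-sketch is a hypothetical name; it accepts a list, a string, or a single character as char-set, as described above.

```lisp
;; Hypothetical model of string-search-set built on POSITION-IF and FIND.
(defun search-set-sketch (char-set string &optional (from 0) to consider-case)
  (let ((set (if (characterp char-set) (list char-set) char-set))
        (test (if consider-case #'char= #'char-equal)))
    (position-if (lambda (ch) (find ch set :test test))
                 string :start from :end to)))

(search-set-sketch '(#\n #\o) "banana") ; 2
(search-set-sketch "no" "banana")       ; 2
(search-set-sketch #\n "banana" 3)      ; 4, search starts at index 3
```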

string-search-not-set char-set string &optional (from 0) to consider-case

Like string-search-set but searches for a character that is not in char-set.
Example: 

(string-search-not-set '(#\a #\b) "banana") => 2 

string-reverse-search-char char string &optional from (to 0) consider-case

Searches through string in reverse order, starting from the index one less than from (nil
for from starts at the end of string), and returns the index of the first character which is
char-equal to char, or nil if none is found. Note that the index returned is from the
beginning of the string, although the search starts from the end. The last (leftmost)
character of string examined is the one at index to.

Case and font are significant in character comparison if consider-case is non-nil. In this
case, char= is used for the comparison rather than char-equal.

Example: 

(string-reverse-search-char #\n "banana") => 4 
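
The roles of from and to in a reverse search can be sketched in portable Common Lisp. reverse-search-char-sketch is a hypothetical name: from acts as an exclusive upper bound and to as an inclusive lower bound, following the description above.

```lisp
;; Hypothetical model of string-reverse-search-char built on POSITION.
;; The region examined is indices TO (inclusive) up to FROM (exclusive).
(defun reverse-search-char-sketch (char string
                                   &optional from (to 0) consider-case)
  (position char string
            :start to
            :end from   ; nil means "start at the end of the string"
            :from-end t
            :test (if consider-case #'char= #'char-equal)))

(reverse-search-char-sketch #\n "banana")   ; 4
(reverse-search-char-sketch #\n "banana" 4) ; 2, index 4 itself is excluded
```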

string-reverse-search-not-char char string &optional from (to 0) consider-case

Like string-reverse-search-char but searches for a character in string that is different
from char.

Example:

(string-reverse-search-not-char #\a "banana") => 4
;4 is the index of the second "n"






string-reverse-search key string &optional from (to 0) (key-from 0) key-to consider-case

Searches for the string key in the string string. The search proceeds in reverse order,
starting from the index one less than from, and returns the index of the first (leftmost)
character of the first instance found, or nil if none is found. Note that the index returned
is from the beginning of the string, although the search starts from the end. The from
condition, restated, is that the instance of key found is the rightmost one whose rightmost
character is before the from'th character of string. nil for from means the search starts at
the end of string. The last (leftmost) character of string examined is the one at index to.

Example: 

(string-reverse-search "na" "banana") => 4 

The arguments key-from and key-to can be used to specify the portion of key to be
searched for, rather than all of key. Case and font are significant in character comparison
if consider-case is non-nil.
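
The from condition can be made concrete with a portable Common Lisp sketch. string-reverse-search-sketch is a hypothetical name built on the standard search function with :from-end; the whole match must lie between to and from.

```lisp
;; Hypothetical model of string-reverse-search built on SEARCH.
(defun string-reverse-search-sketch (key string
                                     &optional from (to 0) (key-from 0) key-to
                                               consider-case)
  (search key string
          :start1 key-from :end1 key-to
          :start2 to :end2 from
          :from-end t
          :test (if consider-case #'char= #'char-equal)))

(string-reverse-search-sketch "na" "banana")   ; 4, the rightmost "na"
(string-reverse-search-sketch "na" "banana" 5) ; 2, the match must end before index 5
```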

string-reverse-search-set char-set string &optional from (to 0) consider-case

Searches through string in reverse order for a character which is char-equal to some
element of char-set. char-set is a set of characters, which can be represented as a list of
characters, a string of characters or a single character.

The search starts from an index one less than from, and returns the index of the first
suitable character found, or nil if none is found. nil for from means the search starts at
the end of string. Note that the index returned is from the beginning of the string,
although the search starts from the end. The last (leftmost) character of string examined
is the one at index to.

Case and font are significant in character comparison if consider-case is non-nil. In this
case, char= is used for the comparison rather than char-equal.

(string-reverse-search-set "ab" "banana") => 5 

string-reverse-search-not-set char-set string &optional from (to 0) consider-case

Like string-reverse-search-set but searches for a character which is not in char-set.
(string-reverse-search-not-set '(#\a #\n) "banana") => 0

string-subst-char new-char old-char string &optional (copy-p t) (retain-font-p t)

Returns a copy of string in which all occurrences of old-char have been replaced by
new-char.

Case and font are ignored in comparing old-char against characters of string. Normally
the font information of the character replaced is preserved, so that an old-char in font 3
is replaced by a new-char in font 3. If retain-font-p is nil, the font specified in new-char
is stored whenever a character is replaced.

If copy-p is nil, string is modified destructively and returned. No copy is made.
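
The copying and destructive modes can be sketched in portable Common Lisp with substitute and nsubstitute. subst-char-sketch is a hypothetical name; fonts do not exist in standard Common Lisp, so retain-font-p is omitted from the sketch.

```lisp
;; Hypothetical model of string-subst-char without font handling.
;; Comparison ignores case, as described above.
(defun subst-char-sketch (new-char old-char string &optional (copy-p t))
  (funcall (if copy-p #'substitute #'nsubstitute)
           new-char old-char string :test #'char-equal))

(defvar *word* (copy-seq "Banana"))
(defvar *replaced* (subst-char-sketch #\- #\A *word*)) ; "B-n-n-"
;; *word* is still "Banana" because COPY-P defaulted to t.
```

When copy-p is nil, use the returned value rather than relying on the argument having been changed, since nsubstitute is permitted but not required to modify its argument.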






substring-after-char char string &optional start end area

Returns a copy of the portion of string that follows the next occurrence of char after
index start. The portion copied ends at index end. If char is not found before end, a
null string is returned.

The value is consed in area area, or in default-cons-area, unless it is a null string.
start defaults to zero, and end to the length of string.
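
A rough portable Common Lisp sketch of this behaviour follows. substring-after-char-sketch is a hypothetical name, and two details are assumptions not stated in the text: the search is taken to begin at index start itself, and characters are compared with char-equal. The area argument is not modelled.

```lisp
;; Hypothetical model of substring-after-char.
;; Assumptions: the search includes index START and ignores case.
(defun substring-after-char-sketch (char string &optional (start 0) end)
  (let* ((end (or end (length string)))
         (pos (position char string :start start :end end
                                    :test #'char-equal)))
    (if pos
        (subseq string (1+ pos) end)
        "")))

(substring-after-char-sketch #\= "key=value")     ; "value"
(substring-after-char-sketch #\= "key=value" 0 6) ; "va", copy stops at END
(substring-after-char-sketch #\: "key=value")     ; "" when CHAR is absent
```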

See also make-symbol (page 133), which given a string makes a new uninterned symbol with
that print name, and intern (page 645), which given a string returns the one and only symbol (in
the current package) with that print name.

10.5 Maclisp-Compatible Functions 

The following functions are provided primarily for Maclisp compatibility.

alphalessp string1 string2

(alphalessp string1 string2) is equivalent to (string-lessp string1 string2).

samepnamep sym1 sym2

This predicate is equivalent to string=.

getchar string index

Returns the index'th character of string as a symbol. Note that 1-origin indexing is used.
This function is mainly for Maclisp compatibility; aref should be used to index into 
strings (but aref does not coerce symbols or numbers into strings). 

getcharn string index 

Returns the index'th character of string as a fixnum. Note that 1-origin indexing is used. 
This function is mainly for Maclisp compatibility; aref should be used to index into 
strings (but aref does not coerce symbols or numbers into strings). 
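
The 1-origin convention is easy to get wrong when mixing these functions with aref, so a short portable Common Lisp sketch may help. getchar-sketch and getcharn-sketch are hypothetical names; char-code stands in for the fixnum representation of a character.

```lisp
;; Hypothetical models of getchar and getcharn.
;; Both subtract 1 from INDEX because Maclisp counts from 1,
;; and both coerce their first argument with STRING.
(defun getcharn-sketch (string index)
  (char-code (char (string string) (1- index))))

(defun getchar-sketch (string index)
  (intern (string (char (string string) (1- index)))))

(getcharn-sketch "foo" 1) ; code of #\f, the same character as (aref "foo" 0)
(getchar-sketch "foo" 3)  ; a symbol whose print name is "o"
```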

ascii x

Like character, but returns a symbol whose printname is the character instead of
returning a fixnum.
Examples:

(ascii #o101) => A
(ascii #o56) => /.
The symbol returned is interned in the current package (see chapter 27, page 636). 

maknam char-list

Returns an uninterned symbol whose print-name is a string made up of the characters in
char-list.

Example:

(maknam '(a b #\0 d)) => ab0d






implode char-list

implode is like maknam except that the returned symbol is interned in the current 
package. 
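
The difference between maknam and implode is the difference between make-symbol and intern, which can be sketched in portable Common Lisp. maknam-sketch, implode-sketch and element-char are hypothetical names; treating integers as character codes is an assumption of the sketch.

```lisp
;; Hypothetical helper: turn one list element into a character.
;; Integers are assumed to be character codes; symbols contribute
;; the first character of their print name.
(defun element-char (x)
  (cond ((characterp x) x)
        ((integerp x) (code-char x))
        (t (char (string x) 0))))

;; Uninterned result: every call makes a fresh symbol.
(defun maknam-sketch (char-list)
  (make-symbol (map 'string #'element-char char-list)))

;; Interned result: the same characters always yield the same symbol.
(defun implode-sketch (char-list)
  (intern (map 'string #'element-char char-list)))

(eq (maknam-sketch '(a b)) (maknam-sketch '(a b)))   ; false
(eq (implode-sketch '(a b)) (implode-sketch '(a b))) ; true
```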






